Reaction - Curious machines

Greg Detre

Tuesday, April 22, 2003

Schaal (1999) and Schaal et al. (2003) combined

Schaal emphasises the importance of research into imitation learning as "[channeling] investigations in computational motor control towards the important topic of action-perception coupling". Imitation (in its various stronger and weaker forms) seems to play a central role in learning perceptual-motor coupling by directing the search towards valuable actions in a given situation, which would allow learning at a much faster rate than trial and error or statistical approaches. Or, as Schaal puts it, the only way to search huge state-action spaces is either to find more compact state-action representations or to know which parts of the space are most relevant. The imitation learning approaches he discusses can help with both.

It is with regard to the first of these (more compact representations) that he discusses "movement primitives". I found this to be a misnomer, since the movement primitives he describes are by no means primitive or atomic, as one might expect (as in the animation of Dobie). Basically, the idea seems to be to break the total action sequence to be described into a sequence of relatively stereotyped or common low-level actions, and then to encode them as an aggregate in the service of a goal. This way, the issue of combinatorial explosion is avoided, since movement primitives are compact and can be tailored with parameters to apply in multiple domains and slightly different situations. High-level movement primitives could rely on the via-point method, splining or feedforward models to generate the low-level commands and to move through arbitrary trajectories.
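As a rough illustration of what "compact and parameterised" might mean here, the sketch below defines a single reaching primitive described by only three numbers (start, goal, duration) and composes two instances into a longer action. The minimum-jerk profile is a standard smooth point-to-point trajectory that I have swapped in for convenience; it is not taken from Schaal's papers, and the names min_jerk and reach_and_return are hypothetical.

```python
def min_jerk(start, goal, duration, steps=5):
    """Sample a minimum-jerk path from start to goal as (time, position) pairs.

    The whole movement is described by three parameters, which is the
    sense in which the primitive is compact.
    """
    samples = []
    for i in range(steps + 1):
        tau = i / steps  # normalised time in [0, 1]
        # Standard minimum-jerk blend: 0 at tau=0, 1 at tau=1,
        # with zero velocity and acceleration at both ends.
        s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
        samples.append((duration * tau, start + (goal - start) * s))
    return samples


def reach_and_return(home, target, duration, steps=5):
    """Compose two instances of the same primitive into a longer action."""
    out = min_jerk(home, target, duration, steps)
    back = [(t + duration, x) for t, x in min_jerk(target, home, duration, steps)]
    return out + back[1:]  # drop the duplicated sample at the turning point


if __name__ == "__main__":
    for t, x in reach_and_return(0.0, 1.0, 2.0, steps=4):
        print(f"t={t:.2f}  x={x:.3f}")
```

Changing the goal or the duration reuses the same primitive in a slightly different situation, so the search is over a handful of parameters and not over every point on the trajectory.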

This rests on one of the most important ideas, that of "movement recognition ... based on the movement generation system". That is, the system starts by using supervised learning to build an internal feedforward model of its own actions, which predicts what will happen when a given motor command is issued. This feedforward model can then be employed when trying to imitate, to test internally which motor command will produce the desired result. When there are multiple feedforward models, which might differ in the representations they employ, the limbs they emphasise or the actions on which they were trained in the past, these can compete so that the system can choose whichever appears to offer the most reliable prediction in a given situation.
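The competition between feedforward models can be sketched roughly as follows. This is a toy illustration under my own assumptions, not Schaal's architecture: motor commands and outcomes are single numbers, each model's prediction function is fixed instead of learned, and "reliability" is simply the mean absolute prediction error on past experience. All class and function names are hypothetical.

```python
class ForwardModel:
    """Predicts the outcome of a motor command and tracks its own accuracy."""

    def __init__(self, name, predict):
        self.name = name
        self.predict = predict  # function: command -> predicted outcome
        self.errors = []        # absolute prediction errors seen so far

    def record(self, command, observed):
        """Compare a prediction with what actually happened."""
        self.errors.append(abs(self.predict(command) - observed))

    def mean_error(self):
        if not self.errors:
            return float("inf")  # an untested model is treated as unreliable
        return sum(self.errors) / len(self.errors)


def most_reliable(models):
    """The competition: the model with the lowest past error wins."""
    return min(models, key=lambda m: m.mean_error())


def choose_command(model, candidates, desired):
    """Internally test candidate commands and keep the one whose
    predicted outcome lies closest to the demonstrated outcome."""
    return min(candidates, key=lambda c: abs(model.predict(c) - desired))


if __name__ == "__main__":
    def plant(c):
        # Stand-in for the real body: outcome is twice the command.
        return 2.0 * c

    models = [ForwardModel("accurate", lambda c: 2.0 * c),
              ForwardModel("biased", lambda c: 3.0 * c)]
    for command in (1.0, 2.0, 3.0):  # self-generated experience
        for m in models:
            m.record(command, plant(command))

    winner = most_reliable(models)
    command = choose_command(winner, [0.0, 0.5, 1.0, 1.5, 2.0], desired=3.0)
    print(winner.name, command)
```

The point of the sketch is that imitation never requires executing the candidate commands: the search runs entirely against the internal model, which is only as good as the system's earlier experience of its own movements.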

I wasn't entirely happy with the distinction made between task-strategy and task-goal. This seems to be important, since he defines true imitation as being present only if:

1. the imitated behavior is new for the imitator

2. the same task strategy as that of the demonstrator is employed

3. the same task goal is accomplished

I didn't see a definition of either "task strategy" or "task goal". They might be analogous to the difference between action-level imitation ("the indiscriminate copying of the actions of the teacher without mapping them onto more abstract motor representation") and program-level imitation ("a process by which the structural organization of a behavior is copied from observing a teacher, while the exact details of actions are filled in by individual learning"). In other words, the task strategy would be some medium-level description of how to achieve the task, and the task goal would be a goal-state or abstract description of why the task is being performed. Unfortunately, though, this distinction is problematic. After all, the task strategies we employ in later life are presumably composed out of the simple task goals we learned in the past, implying that there is some commonality of representation, and that strategies and goals might be collapsed into a single hierarchy. Another problem involves the author's decision to stick to visually-mediated imitation, since language seems necessary in most situations to communicate and share a goal in the first place.

Finally, the notion that the goals of the student and of the teacher can be declared unproblematically to be the "same" masks a necessary mapping that has to be made between them. When imitating a tennis swing, I may have to map the goal of "his hitting that ball with his right arm holding that tennis racket" to "me hitting this ball with my left arm (perhaps) holding this tennis racket". This is the "correspondence" problem, of mapping coordinate frames. But the problem may be deeper: I may be supposed to be imitating the direction of the shot, or the facial expression, or indeed anything about the situation. There needs to be some means for top-down goal-like knowledge to inform what about the teacher's lower-level actions should be considered salient. This issue of "what about the situation to imitate" is reflected in the discussion of a cost function J, which needs to capture both the task goal and the quality of imitation in achieving the task goal.
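One simple way such a cost function J could combine the two requirements is as a weighted sum of a goal term and an imitation term. The sketch below is my own assumption about its form, not the definition used in the papers: trajectories are lists of one-dimensional positions of equal length, already mapped into the student's coordinate frame, and the weights w_goal and w_imitation are hypothetical parameters.

```python
def imitation_cost(student, teacher, goal, w_goal=1.0, w_imitation=0.5):
    """Toy cost J = w_goal * goal_error + w_imitation * imitation_error.

    student, teacher: equal-length lists of positions over time.
    goal: the position the movement is supposed to end at.
    """
    if len(student) != len(teacher):
        raise ValueError("trajectories must have the same length")
    # Task goal: did the movement end where it was supposed to?
    goal_error = (student[-1] - goal) ** 2
    # Quality of imitation: mean squared deviation from the demonstration.
    imitation_error = sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)
    return w_goal * goal_error + w_imitation * imitation_error


if __name__ == "__main__":
    teacher = [0.0, 0.5, 1.0]
    print(imitation_cost([0.0, 0.5, 1.0], teacher, goal=1.0))  # exact copy
    print(imitation_cost([0.0, 0.2, 1.0], teacher, goal=1.0))  # same goal, different path
    print(imitation_cost([0.0, 0.5, 0.8], teacher, goal=1.0))  # same path, misses the goal
```

Even in this toy form, the choice of weights and of which quantities enter each term already encodes a decision about what is salient in the demonstration, which is exactly the knowledge the top-down goal would have to supply.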

Schaal's discussion of imitation (especially the neuroscientific evidence) almost seems to assume that there is only one motor-imitation system in the brain. We know that mirror neurons are specific to special motor behaviours with a particular object, as executed by physiologically "similar" beings. As a result, it's conceivable that (at least in humans) there's some other, more general imitative system that we use when imitating non-humans, or imitating arbitrary actions with uncommon limb movements. There may be more fine-grained domains of specificity for mirror neurons, e.g. for voices, facial expressions. Perhaps signature-forgers have mirror neurons that respond to certain kinds of hand-writing movements. There may, as Schaal seems to believe, be some division between imitation of a goal and imitation of a motor performance.